Learning to generate one-sentence biographies from Wikidata

نویسندگان

  • Ben Hachey
  • Will Radford
  • Andrew Chisholm
چکیده

We investigate the generation of onesentence Wikipedia biographies from facts derived from Wikidata slot-value pairs. We train a recurrent neural network sequence-to-sequence model with attention to select facts and generate textual summaries. Our model incorporates a novel secondary objective that helps ensure it generates sentences that contain the input facts. The model achieves a BLEU score of 41, improving significantly upon the vanilla sequence-to-sequence model and scoring roughly twice that of a simple template baseline. Human preference evaluation suggests the model is nearly as good as the Wikipedia reference. Manual analysis explores content selection, suggesting the model can trade the ability to infer knowledge against the risk of hallucinating incorrect information.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Learning to Generate Wikipedia Summaries for Underserved Languages from Wikidata

While Wikipedia exists in 287 languages, its content is unevenly distributed among them. In this work, we investigate the generation of open domain Wikipedia summaries in underserved languages using structured data from Wikidata. To this end, we propose a neural network architecture equipped with copy actions that learns to generate single-sentence and comprehensible textual summaries from Wiki...

متن کامل

Gender Gap Through Time and Space: A Journey Through Wikipedia Biographies and the "WIGI" Index

In this study we investigate how quantification of Wikipedia biographies can shed light on worldwide longitudinal gender inequality trends. We present an academic index allowing comparative study of gender inequality through space and time, the Wikipedia Gender Index (WIGI), based on metadata available through the Wikidata database. Our research confirms that gender inequality is a phenomenon w...

متن کامل

Evaluating the Success of the Visual Learners in Vocabulary Learning through Word List versus Sentence Making Approaches

Thisstudy sought to evaluate the learners' achievements with the visual learning style when exposed to the sentence making and word list approaches. On that account, 45 basic level participants who studied at the Iran Language Institute (ILI), Bushehr, took part in this research study. At the outset, the learners were given Barsch learning style inventory (1991) to determine the learners' learn...

متن کامل

Evaluating the Success of the Visual Learners in Vocabulary Learning through Word List versus Sentence Making Approaches.

Thisstudy sought to evaluate the learners'''' achievements with the visual learning style when exposed to the sentence making and word list approaches. On that account, 45 basic level participants who studied at the Iran Language Institute (ILI), Bushehr, took part in this research study. At the outset, the learners were given Barsch learning style inventory (1991) to determine the learners''''...

متن کامل

Unsupervised Models of Text Structure

Models of text structure are necessary for applications that generate text. These models provide information about what content fits together and how to organize the content as coherent text. In some domains such as newswire, biographies and stories for children, texts tend to have similar content and structure. Such regularities have allowed the development of unsupervised methods to learn tex...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017